The emergence of low-cost, small-form-factor, and lightweight solid-state LiDAR sensors has brought new opportunities for autonomous unmanned aerial vehicles (UAVs) by advancing navigation safety and computation efficiency. Yet the successful development of LiDAR-based UAVs must rely on extensive simulation. Existing simulators can hardly simulate real-world environments because they require dense mesh maps that are difficult to obtain. In this paper, we develop a point-realistic simulator of real-world scenes for LiDAR-based UAVs. The key idea is the underlying point rendering method, in which we construct a depth image directly from the point cloud map and interpolate it to obtain realistic LiDAR point measurements. The simulator runs on a lightweight computing platform and supports the simulation of LiDARs with different resolutions and scanning patterns, dynamic obstacles, and multi-UAV systems. Developed in the ROS framework, the simulator can easily communicate with other key modules of an autonomous robot, such as perception, state estimation, planning, and control. Finally, the simulator provides 10 high-resolution point cloud maps of various real-world environments, including forests of different densities, a historic building, an office, a parking garage, and various complex indoor environments. These realistic maps provide diverse testing scenarios for an autonomous UAV. Evaluation results show that the simulator consumes less time and memory than Gazebo and that the simulated UAV flights closely match the actual flights in real-world environments. We believe such a point-realistic and lightweight simulator is crucial to bridging the gap between UAV simulation and experiments and will significantly facilitate research on LiDAR-based autonomous UAVs.
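The point rendering idea (build a depth image from the point cloud, then interpolate it) can be illustrated with a minimal sketch. This is not the simulator's implementation: the spherical projection, the image size, the vertical field of view, and the neighbour-averaging hole filling are all assumptions chosen for illustration, and `render_depth_image` and `fill_holes` are hypothetical names.

```python
import math


def render_depth_image(points, width=8, height=4,
                       v_fov=(-math.pi / 4, math.pi / 4)):
    """Project 3D points (sensor frame) into a spherical depth image.

    Each pixel keeps the nearest range that falls into it; pixels that
    receive no point stay None. Image size and field of view are
    illustrative assumptions.
    """
    image = [[None] * width for _ in range(height)]
    v_min, v_max = v_fov
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # a point at the sensor origin has no direction
        azimuth = math.atan2(y, x)        # in [-pi, pi]
        elevation = math.asin(z / rng)    # in [-pi/2, pi/2]
        if not (v_min <= elevation <= v_max):
            continue  # outside the assumed vertical field of view
        col = min(int((azimuth + math.pi) / (2 * math.pi) * width), width - 1)
        row = min(int((elevation - v_min) / (v_max - v_min) * height), height - 1)
        if image[row][col] is None or rng < image[row][col]:
            image[row][col] = rng  # nearer points occlude farther ones
    return image


def fill_holes(image):
    """Fill each empty pixel with the mean of its valid left/right
    neighbours (columns wrap around in azimuth). A stand-in for the
    interpolation step, not the paper's interpolation scheme."""
    height, width = len(image), len(image[0])
    filled = [row[:] for row in image]
    for r in range(height):
        for c in range(width):
            if image[r][c] is None:
                neighbours = [image[r][(c + d) % width] for d in (-1, 1)]
                valid = [v for v in neighbours if v is not None]
                if valid:
                    filled[r][c] = sum(valid) / len(valid)
    return filled
```

Calling `fill_holes(render_depth_image(points))` on a sparse map yields a denser range image; each valid pixel can then be converted back into a 3D point along its ray to emulate a LiDAR return.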
Automated detection of lung infections from computed tomography (CT) data plays an important role in combating COVID-19. However, developing such AI systems still faces several challenges. 1) Most current COVID-19 infection segmentation methods rely mainly on 2D CT images, which lack 3D sequential constraints. 2) Existing 3D CT segmentation methods focus on single-scale representations, which do not provide receptive fields of multiple sizes on the 3D volume. 3) The sudden outbreak of COVID-19 makes it hard to annotate sufficient CT volumes for training deep models. To address these issues, we first build a multiple dimensional-attention convolutional neural network (MDA-CNN) to aggregate multi-scale information along different dimensions of the input feature maps and impose supervision on multiple predictions from different CNN layers. Second, we use this MDA-CNN as the basic network in a novel dual multi-scale mean teacher network (DM${^2}$T-Net) for semi-supervised COVID-19 lung infection segmentation on CT volumes, leveraging unlabeled data and exploring multi-scale information. Our DM${^2}$T-Net encourages the multiple predictions at different CNN layers of the student and teacher networks to be consistent, which yields a multi-scale consistency loss on unlabeled data; this loss is then added to the supervised loss computed on the labeled data from the multiple predictions of MDA-CNN. Third, we collect two COVID-19 segmentation datasets to evaluate our method. The experimental results show that our network consistently outperforms the compared state-of-the-art methods.
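The mean teacher scheme with a multi-scale consistency loss can be sketched as follows. This is a toy illustration on plain Python lists, not the DM${^2}$T-Net code: the exponential-moving-average teacher update is the standard mean teacher recipe, while the use of mean squared error per scale, the equal averaging across scales, and the `weight` factor are assumptions; all function names are hypothetical.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight
    (exponential moving average, as in the standard mean teacher)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def mse(a, b):
    """Mean squared error between two equally sized prediction vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_scale_consistency(student_preds, teacher_preds):
    """Average the per-scale MSE between student and teacher predictions.

    Each argument is a list with one flattened prediction per CNN layer
    (scale). MSE and equal weighting are illustrative assumptions.
    """
    losses = [mse(s, t) for s, t in zip(student_preds, teacher_preds)]
    return sum(losses) / len(losses)


def total_loss(supervised, consistency, weight=0.1):
    """Supervised loss on labeled data plus the (assumed) weighted
    consistency loss on unlabeled data."""
    return supervised + weight * consistency
```

In a training loop, the student would be updated by gradient descent on `total_loss`, and the teacher would then be refreshed with `ema_update` after each step.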
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions to this problem have been proposed in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU and achieve up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene in human disease. For AI models to be used clinically, they need to be made safe, reproducible, and robust, and the underlying software framework must be aware of the particularities (e.g., geometry, physiology, physics) of the medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations, and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by, and receiving contributions from, research, clinical, and industrial teams around the world, who are pursuing applications spanning nearly every aspect of healthcare.
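Fight detection in videos is an emerging deep learning application, driven by today's prevalence of surveillance systems and streaming media. Previous work has largely relied on action recognition techniques to tackle this problem. In this paper, we propose a simple but effective method that solves the task from a new perspective: we design the fight detection model as a composition of an action-aware feature extractor and an anomaly score generator. In addition, considering that collecting frame-level labels for videos is too laborious, we design a weakly supervised two-stage training scheme, in which we use a multiple-instance-learning loss calculated on video-level labels to train the score generator and adopt a self-training technique to further improve its performance. Extensive experiments on a publicly available large-scale dataset, UBI-Fights, demonstrate the effectiveness of our method, whose performance on the dataset exceeds that of several previous state-of-the-art approaches. Furthermore, we collect a new dataset, VFD-2000, which specializes in video fight detection and has a larger scale and more scenarios than existing datasets. The implementation of our method and the proposed dataset will be publicly available at https://github.com/hepta-col/videofightdetection.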
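A multiple-instance-learning loss on video-level labels can be sketched as below. The abstract does not state the exact loss, so this assumes a common choice from weakly supervised video anomaly detection: a hinge ranking loss between the highest-scoring segment of a fight video and the highest-scoring segment of a normal video. The function name, the margin, and the per-segment score lists are hypothetical.

```python
def mil_ranking_loss(fight_scores, normal_scores, margin=1.0):
    """Hinge ranking loss over two bags of per-segment anomaly scores.

    fight_scores: scores for the segments of a video labeled "fight"
                  (at least one segment should contain a fight).
    normal_scores: scores for the segments of a video labeled "normal".
    The loss pushes the top fight-video score above the top normal-video
    score by at least `margin`. This specific form is an assumption.
    """
    return max(0.0, margin - max(fight_scores) + max(normal_scores))
```

Only the video-level label decides which bag a video belongs to, so no frame-level annotation is needed; the loss is zero once the most suspicious segment of the fight video outscores every segment of the normal video by the margin.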
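In this paper, we tackle the problem of online quadrotor whole-body motion planning (SE(3) planning) in unknown and unstructured environments. We propose a novel multi-resolution search method that discovers narrow areas, which require full pose planning, and normal areas, which require only position planning. As a result, a quadrotor planning problem is decomposed into several SE(3) (if necessary) and R^3 sub-problems. To fly through the discovered narrow areas, we propose a carefully designed corridor generation strategy for narrow areas, which significantly increases the planning success rate. The overall problem decomposition and hierarchical planning framework substantially accelerate the planning process, making it possible to work online with fully onboard sensing and computation in unknown environments. Extensive simulation benchmark comparisons show that the proposed method is orders of magnitude faster than state-of-the-art methods in computation time while maintaining a high planning success rate. Finally, the proposed method is integrated into a LiDAR-based autonomous quadrotor, and various real-world experiments in unknown and unstructured environments are conducted to demonstrate its outstanding performance.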
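Accurate self and relative state estimation is a critical precondition for completing swarm tasks such as collaborative autonomous exploration, target tracking, and search and rescue. This paper proposes a fully decentralized state estimation method for aerial swarm systems, in which each UAV performs precise ego-state estimation, exchanges ego-state and mutual observation information through wireless communication, and estimates its relative state with respect to (w.r.t.) the other UAVs, all in real time and based only on LiDAR-inertial measurements. A novel 3D LiDAR-based UAV detection, identification, and tracking method is proposed to obtain observations of teammate UAVs. The mutual observation measurements are then tightly coupled with IMU and LiDAR measurements to estimate the ego-state and relative state in real time and with high accuracy. Extensive real-world experiments show broad adaptability to complicated scenarios, including GPS-denied scenes and scenes that are degenerate for cameras (dark night) or for LiDAR (facing a single wall). Compared with the ground truth provided by a motion capture system, the results show centimeter-level localization accuracy, which outperforms other state-of-the-art LiDAR-inertial odometry methods for single-UAV systems.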
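How would you repair a physical object with large missing parts? You might first recover its global yet coarse shape and then progressively add its local details. We are motivated to imitate this physical repair procedure to address the point cloud completion task. We propose a novel stepwise point cloud completion network (SPCNET) for various 3D models with large missing parts. SPCNET has a hierarchical bottom-up network architecture. It performs shape completion in an iterative manner: 1) it first infers the global feature of the coarse result; 2) it then infers the local feature with the aid of the global feature; and 3) it finally infers the detailed result with the help of the local feature and the coarse result. Beyond borrowing the wisdom of physical repair, we also design a new training strategy based on a cycle loss to enhance the generalization and robustness of SPCNET. Extensive experiments clearly show the superiority of SPCNET over state-of-the-art methods on 3D point clouds with large missing parts.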
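This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. The challenge includes two tracks. Track 1 targets the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed videos. In Track 1, we use the popular DIV2K dataset as the training, validation, and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos: the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state of the art in super-resolution of compressed images and videos. The proposed LDV 3.0 dataset is available at https://github.com/renyang-home/ldv_dataset. The homepage of this challenge is at https://github.com/renyang-home/aim22_compresssr.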
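With the increasing interest in applying machine learning techniques to safety-critical systems, the robustness of neural networks under external disturbances is receiving more and more attention. Global robustness is a robustness property defined over the entire input domain, and a certified globally robust network can guarantee its robustness on any possible network input. However, the state-of-the-art global robustness certification algorithm can only certify networks with at most several thousand neurons. In this paper, we propose Grocet, a GPU-supported global robustness certification framework that is more efficient than previous optimization-based certification approaches. Moreover, Grocet provides differentiable global robustness, which is leveraged in the training of globally robust neural networks.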